AITopics | entity list

entity list

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Explain Less, Understand More: Jargon Detection via Personalized Parameter-Efficient Fine-tuning

Wu, Bohao, Wang, Qingyun, Guo, Yue

arXiv.org Artificial Intelligence | Oct-14-2025

Personalizing jargon detection and explanation is essential for making technical documents accessible to readers with diverse disciplinary backgrounds. However, tailoring models to individual users typically requires substantial annotation effort and computational resources due to user-specific finetuning. To address this, we present a systematic study of personalized jargon detection, focusing on methods that are both efficient and scalable for real-world deployment. We explore two personalization strategies: (1) lightweight finetuning using Low-Rank Adaptation (LoRA) on open-source models, and (2) personalized prompting, which tailors model behavior at inference time without retraining. To reflect realistic constraints, we also investigate semi-supervised approaches that combine limited annotated data with self-supervised learning from users' publications. Our personalized LoRA model outperforms GPT-4 with contextual prompting by 21.4% in F1 score and exceeds the best performing oracle baseline by 8.3%. Remarkably, our method achieves comparable performance using only 10% of the annotated training data, demonstrating its practicality for resource-constrained settings. Our study is the first to systematically explore efficient, low-resource personalization of jargon detection using open-source language models, offering a practical path toward scalable, user-adaptive NLP systems.

annotator, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2505.16227

Country:

North America > United States (0.29)
North America > Mexico > Mexico City (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

US unleashes another crackdown on China's chip industry

Al Jazeera | Dec-2-2024, 21:54:41 GMT

The United States has launched its third crackdown in three years on China's semiconductor industry, curbing exports to 140 companies, including chip equipment maker Naura Technology Group, among other moves. The latest effort on Monday to hobble Beijing's chipmaking ambitions also hits Chinese chip toolmakers Piotech, ACM Research and SiCarrier Technology with new export restrictions as part of the package, which also takes aim at shipments of advanced memory chips and more chipmaking tools to China. The move is one of President Joe Biden's last large-scale efforts to stymie China's ability to access and produce chips that can help advance artificial intelligence for military applications, or otherwise threaten US national security. It comes just weeks before the swearing-in of Republican President-elect Donald Trump, who is expected to retain many of Biden's tough-on-China measures. The package includes curbs on China-bound shipments of high bandwidth memory (HBM) chips, critical for high-end applications like AI training; curbs on 24 additional chipmaking tools and three software tools; and export curbs on chipmaking equipment made in countries such as Singapore and Malaysia.

china, export, restriction, (15 more...)

Al Jazeera

Country:

North America > United States (1.00)
Asia > Singapore (0.26)
Asia > Malaysia (0.26)
(8 more...)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology: Information Technology > Artificial Intelligence (0.36)

Annotation Guidelines for Corpus Novelties: Part 2 -- Alias Resolution Version 1.0

Amalvy, Arthur, Labatut, Vincent

arXiv.org Artificial Intelligence | Oct-1-2024

This document provides instructions for annotating aliases in the Novelties corpus. The corpus itself will be described separately. It was constituted mainly to fulfill two goals: in the short term, to train and test NLP methods able to handle long texts, and in the longer term, to develop Renard [2], a pipeline that extracts character networks from literary fiction. This pipeline includes several processing steps besides alias resolution, including named entity recognition and coreference resolution. Character networks can be used to tackle a number of tasks, including assessing literary theories or the level of historicity of a narrative, detecting roles in stories, classifying novels, identifying subplots, segmenting a storyline, summarizing a story, designing recommendation systems, and aligning narratives. See the detailed survey of Labatut and Bost [6] for more information regarding character networks. Annotation guidelines for alias resolution are rare in the literature, so the ones presented here are designed from scratch, taking into account this application's context.

annotation guideline, canonical form, musketeer, (11 more...)

arXiv.org Artificial Intelligence

2410.00522

Country:

Europe > France > Hauts-de-France (0.05)
Europe > Austria (0.05)
North America > Greenland (0.04)
(3 more...)

Genre: Research Report (0.40)

Industry: Consumer Products & Services (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.55)

Chain-of-Layer: Iteratively Prompting Large Language Models for Taxonomy Induction from Limited Examples

Zeng, Qingkai, Bai, Yuyang, Tan, Zhaoxuan, Feng, Shangbin, Liang, Zhenwen, Zhang, Zhihan, Jiang, Meng

arXiv.org Artificial Intelligence | Feb-11-2024

Automatic taxonomy induction is crucial for web search, recommendation systems, and question answering. Manual curation of taxonomies is expensive in terms of human effort, making automatic taxonomy construction highly desirable. In this work, we introduce Chain-of-Layer, an in-context learning framework designed to induce taxonomies from a given set of entities. Chain-of-Layer breaks down the task into selecting relevant candidate entities in each layer and gradually building the taxonomy from top to bottom. To minimize errors, we introduce the Ensemble-based Ranking Filter to reduce the hallucinated content generated at each iteration. Through extensive experiments, we demonstrate that Chain-of-Layer achieves state-of-the-art performance on four real-world benchmarks.

entity list, taxonomy, taxonomy induction, (13 more...)

arXiv.org Artificial Intelligence

2402.07386

Country:

North America > United States > Indiana > St. Joseph County > Notre Dame (0.05)
North America > United States > Washington > King County > Seattle (0.04)
Europe > North Macedonia > Skopje Statistical Region > Skopje Municipality > Skopje (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)

Adaptive Contextual Biasing for Transducer Based Streaming Speech Recognition

Xu, Tianyi, Yang, Zhanheng, Huang, Kaixun, Guo, Pengcheng, Zhang, Ao, Li, Biao, Chen, Changru, Li, Chao, Xie, Lei

arXiv.org Artificial Intelligence | Aug-15-2023

By incorporating additional contextual information, deep biasing methods have emerged as a promising solution for speech recognition of personalized words. However, for real-world voice assistants, always biasing on such personalized words with high prediction scores can significantly degrade the performance of recognizing common words. To address this issue, we propose an adaptive contextual biasing method based on Context-Aware Transformer Transducer (CATT) that utilizes the biased encoder and predictor embeddings to perform streaming prediction of contextual phrase occurrences. Such prediction is then used to dynamically switch the bias list on and off, enabling the model to adapt to both personalized and common scenarios.

The introduced entity encoder enables the entity list to be personalized for individual users. However, this personalization comes at a cost: the model has less prior knowledge of the customized words, which can result in false alarms. In other words, the model may mistakenly identify non-entity names as entity terms, leading to a decrease in overall recognition performance, particularly for words that are phonemically similar. For example, if we add "José" as a context phrase, the ASR system might falsely recognize "O say can you see" as "José can you see". This issue is particularly acute for a general ASR system that is not restricted to a particular domain. As a result, this drawback makes biased models less competitive, as the benefits gained may be outweighed by the negative impact on overall …

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2306.00804

Country: Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

How Shady Chinese Encryption Chips Got Into the Navy, NATO, and NASA

WIRED | Jun-15-2023, 10:00:00 GMT

From TikTok to Huawei routers to DJI drones, rising tensions between China and the US have made Americans--and the US government--increasingly wary of Chinese-owned technologies. But thanks to the complexity of the hardware supply chain, encryption chips sold by the subsidiary of a company specifically flagged in warnings from the US Department of Commerce for its ties to the Chinese military have found their way into the storage hardware of military and intelligence networks across the West. In July of 2021, the Commerce Department's Bureau of Industry and Security added the Hangzhou, China-based encryption chip manufacturer Hualan Microelectronics, also known as Sage Microelectronics, to its so-called "Entity List," a vaguely named trade restrictions list that highlights companies "acting contrary to the foreign policy interests of the United States." Specifically, the bureau noted that Hualan had been added to the list for "acquiring and ... attempting to acquire US-origin items in support of military modernization for [China's] People's Liberation Army." Yet nearly two years later, Hualan--and in particular its subsidiary known as Initio, a company originally headquartered in Taiwan that it acquired in 2016--still supplies encryption microcontroller chips to Western manufacturers of encrypted hard drives, including several that list as customers on their websites Western governments' aerospace, military, and intelligence agencies: NASA, NATO, and the US and UK militaries.

entity list, government, us government, (15 more...)

WIRED

Country:

Asia > Taiwan (0.26)
Asia > China > Zhejiang Province > Hangzhou (0.26)
North America > United States > District of Columbia > Washington (0.06)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Communications > Social Media (0.79)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.53)

Entity-to-Text based Data Augmentation for various Named Entity Recognition Tasks

Hu, Xuming, Jiang, Yong, Liu, Aiwei, Huang, Zhongqiang, Xie, Pengjun, Huang, Fei, Wen, Lijie, Yu, Philip S.

arXiv.org Artificial Intelligence | May-26-2023

Data augmentation techniques have been used to alleviate the problem of scarce labeled data in various NER tasks (flat, nested, and discontinuous NER tasks). Existing augmentation techniques either manipulate the words in the original text that break the semantic coherence of the text, or exploit generative models that ignore preserving entities in the original text, which impedes the use of augmentation techniques on nested and discontinuous NER tasks. In this work, we propose a novel Entity-to-Text based data augmentation technique named EnTDA to add, delete, replace or swap entities in the entity list of the original texts, and adopt these augmented entity lists to generate semantically coherent and entity preserving texts for various NER tasks. Furthermore, we introduce a diversity beam search to increase the diversity during the text generation process. Experiments on thirteen NER datasets across three tasks (flat, nested, and discontinuous NER tasks) and two settings (full data and low resource settings) show that EnTDA could bring more performance improvements compared to the baseline augmentation techniques.
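The four entity-list edits named in this abstract (add, delete, replace, swap) can be illustrated with a minimal sketch. This is not the authors' EnTDA implementation: the function name `augment_entity_list`, the `candidates` pool, and the uniform random choice of positions are all assumptions made for illustration, and the step that generates text from the augmented entity list is not shown.

```python
import random
from typing import List, Optional


def augment_entity_list(
    entities: List[str],
    candidates: List[str],
    op: str,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return a copy of `entities` with one edit applied.

    Hypothetical helper: `candidates` is an assumed pool of entities to
    draw from for the "add" and "replace" operations.
    """
    rng = rng or random.Random()
    out = list(entities)  # work on a copy so the caller's list is untouched
    if op == "add":
        # Insert a candidate entity at a random position.
        out.insert(rng.randrange(len(out) + 1), rng.choice(candidates))
    elif op == "delete":
        # Drop one entity, if there is one to drop.
        if out:
            out.pop(rng.randrange(len(out)))
    elif op == "replace":
        # Overwrite one entity with a candidate entity.
        if out:
            out[rng.randrange(len(out))] = rng.choice(candidates)
    elif op == "swap":
        # Exchange the positions of two distinct entities.
        if len(out) >= 2:
            i, j = rng.sample(range(len(out)), 2)
            out[i], out[j] = out[j], out[i]
    else:
        raise ValueError(f"unknown operation: {op}")
    return out


if __name__ == "__main__":
    entities = ["aspirin", "headache", "Barcelona"]  # made-up example data
    candidates = ["ibuprofen", "fever"]
    rng = random.Random(0)
    for op in ("add", "delete", "replace", "swap"):
        print(op, augment_entity_list(entities, candidates, op, rng))
```

Each call returns a new list, so several augmented variants can be derived from the same original entity list.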

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2210.10343

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

VicunaNER: Zero/Few-shot Named Entity Recognition using Vicuna

Ji, Bin

arXiv.org Artificial Intelligence | May-4-2023

Large Language Models (LLMs, e.g., ChatGPT) have shown impressive zero- and few-shot capabilities in Named Entity Recognition (NER). However, these models can only be accessed via online APIs, which may cause data leaks and reproducibility problems. In this paper, we propose VicunaNER, a zero/few-shot NER framework based on the newly released open-source LLM Vicuna. VicunaNER is a two-phase framework, where each phase leverages multi-turn dialogues with Vicuna to recognize entities from texts. We call the first phase Recognition and the second phase Re-Recognition, which recognizes entities missed in the first phase. Moreover, we set entity correctness check dialogues in each phase to filter out wrong entities. We evaluate VicunaNER's zero-shot capacity on 10 datasets across 5 domains and its few-shot capacity on Few-NERD. Experimental results demonstrate that VicunaNER achieves superior performance in both shot settings. Additionally, we conduct comprehensive investigations on Vicuna from multiple perspectives.

large language model, machine learning, vicuna, (19 more...)

arXiv.org Artificial Intelligence

2305.03253

Country:

Europe > United Kingdom > England (0.06)
North America > United States > Washington > King County > Seattle (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Training a Named Entity Recognition Model Without Data

#artificialintelligence | Feb-12-2023, 01:05:18 GMT

Named Entity Recognition (NER) is the task of recognizing entity names, such as person names, locations, and organizations, within a text. This task serves as a fundamental module for various NLP applications including chatbots, search engines, and translation systems. We can find NER datasets for generic entities easily, but obtaining data for specific domains can be challenging. Labeling NER data is more difficult than simple text classification, making it challenging to create large-scale domain-specific NER datasets. In this post, I will demonstrate how to train an NER model without any labeled data.

dataset, entity name, ner dataset, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.94)

Drones take center stage in U.S.-China war on data harvesting

The Japan Times | Dec-20-2021, 03:04:57 GMT

In video reviews of the latest drone models to his 80,000 YouTube subscribers, Indiana college student Carson Miller doesn't seem like an unwitting tool of Chinese spies. Yet that's how the U.S. is increasingly viewing him and thousands of other Americans who purchase drones built by Shenzhen-based SZ DJI Technology Co., the world's top producer of unmanned aerial vehicles. Miller, who bought his first DJI model in 2016 for $500 and now owns six of them, shows why the company controls more than half of the U.S. drone market. "If tomorrow DJI were completely banned," the 21-year-old said, "I would be pretty frightened." Critics of DJI warn the dronemaker may be channeling reams of sensitive data to Chinese intelligence agencies on everything from critical infrastructure like bridges and dams to personal information such as heart rates and facial recognition.

dji, drone, information, (14 more...)

The Japan Times

Country:

North America > United States > Indiana (0.25)
Asia > China > Guangdong Province > Shenzhen (0.24)
Asia > China > Beijing > Beijing (0.05)
(7 more...)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(4 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
